# Risk

The future return of an investment is obviously unknown, unless you have a crystal ball. We can, however, look at past returns to get an idea of their distribution. In this way we can calculate the expected return (the average return) and how much the data vary around this value (the standard deviation of the return).

An investment therefore has an <b>expected return</b>, which is the value that we expect to get, more or less, and a <b>volatility</b>, which is a measure of how far above or below the expected return we may end up.

Therefore, looking at past data and supposing that our investment will behave in the same way in the future, we can calculate the expected return as the average and the risk as the standard deviation, i.e. the square root of the variance.
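
In symbols, for $T$ past returns $x_1,\dots,x_T$ (the symbols $T$, $x_t$, $\bar{x}$ and $\sigma$ are introduced here only for illustration), the two quantities can be written as
$$\bar{x}=\frac{1}{T}\sum_{t=1}^{T} x_t \qquad\qquad \sigma=\sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(x_t-\bar{x}\right)^2}$$
The $T-1$ denominator gives the sample standard deviation, which is what the pandas `std()` method computes by default.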

For example, consider the returns of these two stocks with 4 years of past history.

In [1]:
import pandas as pd
# as an example, suppose we have
stock1 = pd.Series([+0.15,+0.20,-0.05,+0.10])
stock2 = pd.Series([+0.35,-0.30,+0.40,-0.05])

They have the same expected return, but definitely not the same volatility.

In [2]:
print("Expected returns: ",stock1.mean(),stock2.mean())
print("Volatility: ",round(stock1.std(),4),round(stock2.std(),4))

Expected returns:  0.1 0.1
Volatility:  0.108 0.3342

Combining the two stocks in equal parts gives a portfolio with the same expected return, but with a volatility well below the average of the two volatilities.


In [3]:
portfolio=0.5*stock1+0.5*stock2
print("Expected returns: ",portfolio.mean())
print("Volatility: ",round(portfolio.std(),4))

Expected returns:  0.1
Volatility:  0.1369


### Real example

In [4]:
import numpy as np
import pandas as pd
# from pandas_datareader import data as wb
import yfinance as yf

In [5]:
tickers=["PG","MSFT"] # Procter & Gamble and Microsoft

In [6]:
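# note: recent yfinance versions adjust prices by default and may not return an "Adj Close" column;
# in that case pass auto_adjust=False to yf.download, or select ["Close"] instead
prices=yf.download(tickers,start="2009-1-1")["Adj Close"]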
prices

[*********************100%***********************]  2 of 2 completed


Date,MSFT,PG
2008-12-31,14.498389,40.375351
2009-01-02,15.162148,41.015430
2009-01-05,15.303847,40.721508
2009-01-06,15.482841,40.603958
2009-01-07,14.550593,39.892056
...,...,...
2023-03-07,254.149994,137.559998
2023-03-08,253.699997,137.580002
2023-03-09,252.320007,136.570007
2023-03-10,248.589996,137.190002


In [7]:
logret=np.log(prices/prices.shift(1)) 
logret

Date,MSFT,PG
2008-12-31,,
2009-01-02,0.044765,0.015729
2009-01-05,0.009302,-0.007192
2009-01-06,0.011628,-0.002891
2009-01-07,-0.062101,-0.017688
...,...,...
2023-03-07,-0.010645,-0.020079
2023-03-08,-0.001772,0.000145
2023-03-09,-0.005454,-0.007368
2023-03-10,-0.014893,0.004529
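
Log returns are convenient because they add up over time: the sum of the daily log returns equals the log return over the whole period. A minimal sketch of this property, using made-up prices (the `toy_prices` values are invented for illustration and are not market data):

```python
import numpy as np
import pandas as pd

# hypothetical closing prices, invented for illustration
toy_prices = pd.Series([100.0, 105.0, 102.9, 108.045])

# daily log returns; the first one is NaN because there is no previous price
toy_logret = np.log(toy_prices / toy_prices.shift(1))

# pandas skips the NaN when summing
total_from_daily = toy_logret.sum()
# log return computed directly from the first and last price
total_direct = np.log(toy_prices.iloc[-1] / toy_prices.iloc[0])

print(total_from_daily, total_direct)
```

The two printed numbers should agree up to floating-point rounding.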


In [8]:
print(logret.mean()) # average daily logret

MSFT    0.000802
PG      0.000344
dtype: float64


In [9]:
print(logret.mean()*250) # yearly return, assuming 250 trading days per year

MSFT    0.200377
PG      0.086090
dtype: float64


In [10]:
print(logret.std()) # standard deviation of daily logret

MSFT    0.017012
PG      0.011310
dtype: float64


In [11]:
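portfolioLogret=np.dot(logret,[0.5,0.5]) # plain NumPy array; the first element is NaN, hence [1:]
print(portfolioLogret[1:].mean()*250)
# NumPy's std() uses ddof=0 while pandas' std() uses ddof=1, so this differs slightly from pandas
print(portfolioLogret[1:].std())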

0.14323312844171146
0.012049012658518574
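
The volatility printed here is slightly smaller than the value pandas gives for the same portfolio. This is consistent with the fact that `np.dot` returns a NumPy array, whose `std()` divides by $N$ by default, while the pandas `std()` divides by $N-1$. A small sketch of the two conventions, on invented numbers:

```python
import numpy as np
import pandas as pd

# invented returns, only to compare the two conventions
sample = pd.Series([0.15, 0.20, -0.05, 0.10])

pandas_std = sample.std()                         # sample standard deviation (ddof=1)
numpy_std = sample.to_numpy().std()               # population standard deviation (ddof=0)
numpy_std_ddof1 = sample.to_numpy().std(ddof=1)   # same convention as pandas

print(pandas_std, numpy_std, numpy_std_ddof1)
```

With thousands of daily observations the gap between the two conventions becomes very small, but it is still visible in the last digits.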


In [12]:
logret.corr() # correlation 

,MSFT,PG
MSFT,1.0,0.425009
PG,0.425009,1.0


In [13]:
logret.cov() # covariance and variances

,MSFT,PG
MSFT,0.000289,8.2e-05
PG,8.2e-05,0.000128


## Calculating the portfolio's expected return knowing only the stocks' average returns (and not all the daily stock returns)

It is very easy to calculate a portfolio's expected return without knowing the stocks' daily returns, because the expected return is an average, and the average of a weighted sum is the weighted sum of the averages! Therefore, instead of calculating the daily portfolio returns and then averaging them, we can calculate the average daily return of each stock, make it yearly, and then take the weighted sum.
$$\text{portfolio return}=w_1\cdot r_1+w_2\cdot r_2 = \sum_{i=1}^n w_i\cdot r_i$$

In [17]:
# calculate portfolio daily returns and then calculate average over period and make it yearly
portfolio_returns = 0.5 * logret["PG"] + 0.5 * logret["MSFT"]
print( portfolio_returns[1:].mean()*250 ) 

# calculate each stock's average return over the period, make it yearly, and then take the weighted average
PG_average_return=logret["PG"][1:].mean()*250
MSFT_average_return=logret["MSFT"][1:].mean()*250
print( 0.5 * PG_average_return + 0.5 * MSFT_average_return ) 

0.1432331284417117
0.14323312844171157


## Calculating the portfolio's volatility knowing only variances and covariances of stocks' returns (and not all the daily stock returns)

The risk is the standard deviation, and unfortunately this formula does not work for the standard deviation!
$$\text{portfolio volatility}\neq w_1\cdot v_1+w_2\cdot v_2 = \sum_{i=1}^n w_i\cdot v_i$$


In [18]:
# calculate portfolio daily returns and then calculate standard deviation over period
portfolio_returns = 0.5 * logret["PG"] + 0.5 * logret["MSFT"]
print( portfolio_returns[1:].std() ) 

# calculate the standard deviation of each stock's returns and then take their weighted average
PG_volatility=logret["PG"][1:].std()
MSFT_volatility=logret["MSFT"][1:].std()
print( 0.5 * PG_volatility + 0.5 * MSFT_volatility ) 

0.01205069960464635
0.01416103157689414


The formula, using the variance instead of the standard deviation (remember that the variance is the square of the standard deviation), is
$$\text{variance portfolio}=w_1^2\cdot \text{var}_1+w_2^2\cdot \text{var}_2 + 2\cdot w_1\cdot w_2\cdot \text{cov}_{1,2}$$
So we can get the variance, and thus the standard deviation, using the covariance and the variances of the stocks' returns.
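
As a quick check, the two-stock formula can be applied to the small `stock1` and `stock2` example from the beginning of this section (redefined here so that the snippet runs on its own). This is only a sketch; the names `w1`, `w2`, `var_direct` and `var_formula` are chosen for illustration.

```python
import numpy as np
import pandas as pd

stock1 = pd.Series([+0.15, +0.20, -0.05, +0.10])
stock2 = pd.Series([+0.35, -0.30, +0.40, -0.05])
w1, w2 = 0.5, 0.5

# variance of the portfolio computed directly from its returns
var_direct = (w1 * stock1 + w2 * stock2).var()

# variance computed from the two variances and the covariance only
var_formula = (w1**2 * stock1.var()
               + w2**2 * stock2.var()
               + 2 * w1 * w2 * stock1.cov(stock2))

print(round(np.sqrt(var_direct), 4), round(np.sqrt(var_formula), 4))
```

Both volatilities should match the 0.1369 found for the 50/50 portfolio of these two stocks.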

However, this easy formula becomes much more complicated for more than two stocks and it is better to express it in matrix format. If we consider the covariance matrix above and the weights vector
$$
\begin{matrix}
\begin{pmatrix}
\text{var}_1&\text{cov}_{12}\\
\text{cov}_{21}&\text{var}_2
\end{pmatrix}&&&&&
\begin{pmatrix}
w_1\\
w_2
\end{pmatrix}
\end{matrix}
$$

and we multiply them
$$
\begin{pmatrix}
\text{var}_1\cdot w_1+\text{cov}_{12}\cdot w_2\\
\text{cov}_{21}\cdot w_1+\text{var}_2\cdot w_2
\end{pmatrix}
$$

and if we then multiply this result on the left by the transposed weights vector, remembering that $\text{cov}_{12}=\text{cov}_{21}$, we get
$$
\begin{pmatrix}
w_1&w_2
\end{pmatrix}
\cdot
\begin{pmatrix}
\text{var}_1\cdot w_1+\text{cov}_{12}\cdot w_2\\
\text{cov}_{21}\cdot w_1+\text{var}_2\cdot w_2
\end{pmatrix}
=w_1\cdot (\text{var}_1\cdot w_1+\text{cov}_{12}\cdot w_2)+w_2\cdot (\text{cov}_{21}\cdot w_1+\text{var}_2\cdot w_2)
=\text{var}_1\cdot w_1^2+2\cdot\text{cov}_{12}\cdot w_2\cdot w_1+\text{var}_2\cdot w_2^2
$$

Thus the volatility of a portfolio of $n$ stocks can be calculated as the square root of <font color=RED>$\text{variance portfolio}=\text{weights}^T\cdot\text{Cov Matrix}\cdot\text{weights}$</font>

In [22]:
weights=[0.5,0.5]
cov_matrix=logret.cov()
var_portfolio=np.dot(weights, np.dot(cov_matrix,weights) ) 
# in theory the first weight should be transposed, but numpy.dot is very tolerant :-)
print("Portfolio's volatility is ",np.sqrt(var_portfolio))

Portfolio's volatility is  0.012050699604646336
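
The same matrix formula works unchanged for more than two stocks. A minimal sketch with three series of randomly generated returns (the data, the seed, the weights and the names `A`, `B`, `C` are all invented for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# 500 invented daily returns for three hypothetical stocks
fake_returns = pd.DataFrame(rng.normal(0.0, 0.01, size=(500, 3)),
                            columns=["A", "B", "C"])
w = np.array([0.2, 0.3, 0.5])

# weights^T . Cov Matrix . weights
var_matrix = w @ fake_returns.cov().to_numpy() @ w

# variance of the weighted daily returns, computed directly
var_direct = fake_returns.dot(w).var()

print(np.sqrt(var_matrix), np.sqrt(var_direct))
```

The two volatilities should agree up to floating-point rounding, whatever the number of stocks.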


## Systematic versus diversifiable risk

Given the formula for the variance $\text{weights}^T\cdot\text{Cov Matrix}\cdot\text{weights}$, we can split this risk into:
<ul>
    <li>systematic risk as $\sum_{i=1}^n w_i^2\cdot \text{var}_i$, i.e. for the two stocks example $\text{var}_1\cdot w_1^2+\text{var}_2\cdot w_2^2$
    <li>diversifiable risk as $\text{variance portfolio}-\text{systematic risk}$, i.e. for the two stocks example $2w_2 w_1\cdot \text{cov}_{12}$
</ul>
The diversifiable risk is the part which can be strongly reduced by choosing uncorrelated stocks, i.e. stocks with small correlation and thus small covariance. On the other hand, the systematic risk is intrinsic to the stocks and cannot be removed.

Beware that here we are talking about variance and not standard deviation.

In [23]:
systematic_risk=np.dot(weights**2,logret[0:].var())
diversifiable_risk=var_portfolio-systematic_risk
print("Variance is ",var_portfolio,", systematic risk is ",systematic_risk,", while diversifiable risk is ",diversifiable_risk)

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

Oops, a Python list does not support squaring, and not even element-wise multiplication. NumPy's `power` function does the job.

In [24]:
np.power(weights,2)

array([0.25, 0.25])

In [25]:
systematic_risk=np.dot(np.power(weights,2),logret[0:].var())
diversifiable_risk=var_portfolio-systematic_risk
print("Variance is ",var_portfolio,", systematic risk is ",systematic_risk,", while diversifiable risk is ",diversifiable_risk)

Variance is  0.00014521936096142336 , systematic risk is  0.0001043325113949995 , while diversifiable risk is  4.088684956642386e-05


In [26]:
print("Variance is ",round(var_portfolio*100,6),"/100, systematic risk is ",round(systematic_risk*100,6),"/100, while diversifiable risk is ",round(diversifiable_risk*100,6),"/100")

Variance is  0.014522 /100, systematic risk is  0.010433 /100, while diversifiable risk is  0.004089 /100
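
For two stocks, the covariance term can also be written with the correlation, since $\text{cov}_{12}=\rho_{12}\cdot\sigma_1\cdot\sigma_2$. The sketch below plugs in the rounded correlation and daily standard deviations printed earlier in this section, and shows what the same term would be for hypothetical uncorrelated stocks. The function name `cross_term` is chosen here for illustration.

```python
def cross_term(w1, w2, rho, sigma1, sigma2):
    """Covariance part of the two-stock portfolio variance: 2*w1*w2*cov12."""
    return 2 * w1 * w2 * rho * sigma1 * sigma2

# rounded values printed above: correlation, MSFT and PG daily standard deviations
observed = cross_term(0.5, 0.5, 0.425009, 0.017012, 0.011310)
# hypothetical case: same volatilities, zero correlation
uncorrelated = cross_term(0.5, 0.5, 0.0, 0.017012, 0.011310)

print(observed, uncorrelated)
```

The first value should be close to the diversifiable risk computed above (about 4.09e-05); it is not identical because the inputs are rounded.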
